Detection of L0-optimized attacks via anomaly scores distribution analysis
Abstract
The spread of artificial intelligence and machine learning is accompanied by a growing number of vulnerabilities and threats in systems that implement these technologies. Attacks based on malicious perturbations pose a significant threat to such systems. Various defenses have been developed, including an approach to detecting L0-optimized attacks on image-processing neural networks using statistical analysis and an algorithm that detects such attacks by threshold clipping. The drawback of the threshold clipping algorithm is that its parameter (the cutoff threshold) must be chosen to suit the different attacks and the specifics of each dataset, which makes it difficult to apply in practice. This article describes a method for detecting L0-optimized attacks on image-processing neural networks through statistical analysis of the distribution of anomaly scores. To identify the distortion characteristic of L0-optimized attacks, deviations from the nearest neighbors and Mahalanobis distances are determined. Based on these values, a matrix of pixel anomaly scores is calculated. It is assumed that the statistical distribution of pixel anomaly scores differs between attacked and non-attacked images and between the perturbations introduced by different attacks. If so, attacks can be detected by analyzing the statistical characteristics of the anomaly score distribution. These characteristics are used as predictors for training anomaly detection and image classification models. The method was tested on the CIFAR-10, MNIST, and ImageNet datasets and demonstrated high-quality attack detection and classification. On the CIFAR-10 dataset, the accuracy of attack (anomaly) detection was 98.43 %, while the accuracy of binary and multiclass classification was 99.51 % and 99.07 %, respectively.
Although anomaly detection is less accurate than multiclass classification, it allows the method to identify fundamentally similar attacks that are not represented in the training sample. Because only input data is used to detect and classify attacks, the proposed method can potentially be applied regardless of the architecture of the target neural network, or even when the target network is unavailable. The method can also be applied to detect images distorted by L0-optimized attacks in a training sample.
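The pipeline outlined in the abstract (neighbor deviations and Mahalanobis distances, then a pixel anomaly score matrix, then distribution statistics used as predictors) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract does not give the exact formulas, so the following choices are assumptions made only for illustration: the neighbor deviation is taken as the minimum color distance to the 8 adjacent pixels, the Mahalanobis distance is measured against the color distribution of the whole image, the two maps are combined by multiplication, and the feature set (mean, standard deviation, maximum, 99th percentile, skewness, kurtosis) is a hypothetical one. All function names are hypothetical as well.

```python
import numpy as np

def neighbor_deviation(img):
    """Assumed definition: minimum color distance from each pixel to its 8 neighbors."""
    h, w, _ = img.shape
    # Reflect padding gives border pixels mirrored real neighbors, never themselves.
    padded = np.pad(img, ((1, 1), (1, 1), (0, 0)), mode="reflect")
    best = np.full((h, w), np.inf)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            shifted = padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
            best = np.minimum(best, np.linalg.norm(img - shifted, axis=2))
    return best

def mahalanobis_map(img, eps=1e-6):
    """Assumed definition: Mahalanobis distance of each pixel color to the image-wide color distribution."""
    h, w, c = img.shape
    pixels = img.reshape(-1, c)
    centered = pixels - pixels.mean(axis=0)
    # eps regularizes the covariance so that it stays invertible.
    cov = np.cov(pixels, rowvar=False).reshape(c, c) + eps * np.eye(c)
    inv = np.linalg.inv(cov)
    d2 = np.einsum("ij,jk,ik->i", centered, inv, centered)
    return np.sqrt(np.maximum(d2, 0.0)).reshape(h, w)

def anomaly_scores(img):
    """Assumed combination rule: element-wise product of the two maps."""
    return neighbor_deviation(img) * mahalanobis_map(img)

def score_features(scores):
    """Hypothetical predictor set describing the anomaly score distribution."""
    s = scores.ravel()
    mean, std = s.mean(), s.std()
    z = (s - mean) / (std + 1e-12)
    return np.array([mean, std, s.max(), np.percentile(s, 99),
                     (z ** 3).mean(), (z ** 4).mean()])

# Synthetic demo: a smooth gray gradient with two strongly perturbed pixels.
rng = np.random.default_rng(0)
yy, xx = np.mgrid[0:32, 0:32]
clean = np.repeat(((xx + yy) / 62.0)[:, :, None], 3, axis=2)
clean = clean + rng.normal(0.0, 0.01, clean.shape)
attacked = clean.copy()
attacked[5, 7] = [1.0, 0.0, 1.0]
attacked[20, 11] = [0.0, 1.0, 0.0]

clean_scores = anomaly_scores(clean)
attacked_scores = anomaly_scores(attacked)
print(score_features(clean_scores))
print(score_features(attacked_scores))
```

In this sketch a pixel scores high only if it differs from all of its neighbors and is also an outlier in color space, which matches the sparse, high-magnitude distortions typical of L0-optimized attacks. The resulting feature vectors could then be passed to any standard anomaly detector or classifier; that step is omitted here.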